An Open Source Approach to Medium-Term Data Archiving

Authors

  • Sherine Antoun
  • John Fulcher
  • Carole Alcock
Abstract

Medium- to long-term archiving of digital documents, beyond the lifespan of the authoring software and hardware, is a challenging problem. Magnetic and optical media are susceptible to environmental influences and deteriorate over time, often to the point where the archived documents can no longer be retrieved. Previous attempts to address this problem include migration and emulation, both of which have their attendant difficulties. It is the contention of the present study that an Open Source approach offers several advantages. More specifically, archiving the Open Source application programs (in source code, not executable form) along with the documents in question, in both plain and compressed form, significantly increases the likelihood of being able to retrieve such archives at some future time. The application source code can be recompiled to a form suitable for reading in (Open Source) viewers, thereby presenting the archived document to the user as the original author envisaged it. One set of experiments distributed documents, together with their (Open Source) authoring software, via a Portable Virtual Machine (PVM) program to unused disk space on a network of SUN workstations. The success of this approach was evaluated using four measures: (i) lossiness of conversion, (ii) editability, (iii) ability to save back to the original format, and (iv) functionality retention. Another series of experiments deliberately introduced artificial ('speckle' or salt-and-pepper) noise into the archived documents in order to mimic degradation of the storage medium over time. Survivability was found to depend heavily on file type: simple text files and MPEG movies were impervious to introduced noise levels as high as 18%. Source code programs and JPEG images, by contrast, were intolerant of even the smallest noise levels (although straightforward re-editing of the former led to error-free compilation without much difficulty). Lastly, it was found that decompression (specifically the publicly available RAR decompressor) further enhanced the file recovery process. We conclude that an Open Source approach to the preservation of digital archives has considerable potential.
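The noise experiment described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the noise was injected, so the byte-level scheme below is an assumption: a chosen fraction of byte positions is overwritten with 0x00 ("pepper") or 0xFF ("salt"). The function names `add_salt_and_pepper` and `changed_fraction` are hypothetical and are not taken from the paper.

```python
import random


def add_salt_and_pepper(data: bytes, fraction: float, seed: int = 0) -> bytes:
    """Return a copy of `data` with roughly `fraction` of its bytes
    overwritten by 0x00 or 0xFF (assumed byte-level noise model)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)  # seeded so a run can be repeated
    noisy = bytearray(data)
    count = round(len(noisy) * fraction)
    # Pick distinct positions, then force each one to an extreme value.
    for i in rng.sample(range(len(noisy)), count):
        noisy[i] = rng.choice((0x00, 0xFF))
    return bytes(noisy)


def changed_fraction(original: bytes, noisy: bytes) -> float:
    """Fraction of byte positions that differ between the two buffers."""
    if len(original) != len(noisy):
        raise ValueError("buffers must have the same length")
    if not original:
        return 0.0
    return sum(a != b for a, b in zip(original, noisy)) / len(original)


if __name__ == "__main__":
    document = b"plain text archive " * 100
    damaged = add_salt_and_pepper(document, 0.18)
    print(f"{changed_fraction(document, damaged):.2%} of bytes altered")
```

Under this assumed model, a plain text file damaged this way stays largely legible because each corrupted byte affects only one character, whereas a compressed or structured format can become undecodable after a single altered byte. That is consistent with the file-type dependence the abstract reports, although the sketch does not reproduce the authors' measurements.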

Similar resources

Data for the future: The German project "Co-operative development of a long-term digital information archive" (kopal)

Purpose: One of the unresolved problems of the global information society is ensuring the long-term accessibility of digital documents. The project kopal tackles this problem head-on: in a three-year project, kopal's objective is the practical testing and implementation of a cooperatively created and operated long-term archival system for digital resources. Design/methodology/approach: The system ...

Extended Poster Abstract: Open Source Solution for Massive Map Sheet Georeferencing Tasks for Digital Archiving

Scanned maps need to be georeferenced to be useful in a GIS environment for data extraction (vectorization), web publishing or spatially-aware archiving. Widely used software solutions with georeferencing functionality are designed to suit a universal scenario for georeferencing many different kinds of data sources. This general nature also makes them very time-consuming for georeferencing a l...

Study of Solute Dispersion with Source/Sink Impact in Semi-Infinite Porous Medium

Mathematical models for pollutant transport in semi-infinite aquifers are based on the advection-dispersion equation (ADE) and its variants. This study employs the ADE incorporating time-dependent dispersion and velocity and space-time dependent source and sink, expressed by one function. The dispersion theory allows mechanical dispersion to be directly proportional to seepage velocity. Initial...

Effective Triggers and Barriers of Self-archiving Behavior Displayed by Knowledge and Information Sciences’ faculty members in Iran

Background and Aim: The present investigation was carried out in order to study the self-archiving behavior displayed by Knowledge and Information Sciences (KIS) faculty members in Iran. It intended to discover the incentives and barriers impacting on this behavior as well as arriving at a baseline for predicting the extent of self-archiving. Method: A descriptive survey method was deployed. Th...

Priming Effect of on the Enhancement of Germination Traits in Aged Seeds of Chamomile (Matricaria chamomilla L.) Seeds Preserved in Medium and Long-term Storage

Chamomile (Matricaria chamomilla L.) is a widely used medicinal plant possessing several pharmacological effects due to the presence of active compounds. In order to study seed priming effects on seedling growth of chamomile, an experiment based on a randomized complete design with three replications was conducted under greenhouse conditions in Research Institute of Forests and Rangeland...

Publication date: 2006